Regular Expressions
For examples of regular expressions, refer to Metadata Replacement.
Metacharacters
The following characters are called metacharacters, and they have special meaning.
^ $ + ? . * ( ) [ ] { } | \
Other characters are called normal characters, and they do not have special meaning.
To use a metacharacter as a normal character, place a backslash (\) in front of the metacharacter.
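As a minimal sketch of the escaping rule, the following uses Python's `re` module. That choice is an assumption made for illustration only: the syntax described here is Perl-style, and the engine behind this document may differ in details. The sample text is made up.

```python
import re

text = "price: 3.14 or 3x14"

# An unescaped "." is a metacharacter and matches any character.
unescaped = re.findall(r"3.14", text)

# "\." turns the period into a normal character, so only "3.14" matches.
escaped = re.findall(r"3\.14", text)

print(unescaped)  # ['3.14', '3x14']
print(escaped)    # ['3.14']
```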
Character | Meaning |
---|---|
. (period) | Match any character (except Newline). |
[...] | Match any single character within the brackets. |
[^...] | Match any single character not within the brackets. |
^ | Match the beginning of the line. |
$ | Match the end of the line (or before Newline at the end). |
\A | Match only at the beginning of string. |
\Z | Match only at the end of string (or before Newline at the end). |
\G | Match only at pos() (for example, at the end-of-match position of the prior m//g). |
\b | Match a word boundary. |
\B | Match a non-word boundary. |
\w | Match any "word" character (alphanumeric plus "_"). |
\W | Match any non-"word" character. |
\s | Match any whitespace character. |
\S | Match any non-whitespace character. |
\d | Match any digit character (0-9). |
\D | Match any non-digit character. |
\1, \2, ... | Used to refer to a previous group. |
\X | Escape sequence. Match extended Unicode "combining character sequence". Equivalent to (?:\PM\pM*). |
* | Match 0 or more times. |
*? | Match 0 or more times (shortest match). |
+ | Match 1 or more times. |
+? | Match 1 or more times (shortest match). |
? | Match 1 or 0 times. |
?? | Match 0 or 1 time (shortest match). |
{n,m} | Match at least n but not more than m times. |
{n,m}? | Match at least n but not more than m times (shortest match). |
(...) | Grouping. |
\| (vertical bar) | Alternation. |
(?:regexp) | A group that cannot be referred to by \1, \2, ... |
(?=regexp) | Match only if the text that follows matches "regexp" (lookahead). |
(?!regexp) | Match only if the text that follows does not match "regexp" (negative lookahead). |
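The sketch below illustrates a few rows of the table: longest versus shortest match, a back-reference, a non-capturing group, and the two lookahead forms. It uses Python's `re` module as a stand-in, which is an assumption and not necessarily the engine this document describes. Python's `re` has no `\G` or `\X`, and its `\Z` matches only at the very end of the string, so those rows are not shown. All sample strings are invented.

```python
import re

html = "<b>one</b><b>two</b>"

# "*" takes the longest possible match; "*?" takes the shortest.
greedy = re.findall(r"<b>.*</b>", html)
lazy = re.findall(r"<b>.*?</b>", html)

# \1 refers back to whatever group 1 matched (here, a doubled word).
repeated = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# (?:...) groups without capturing, so only the (\d+) group is returned.
digits = re.findall(r"(?:id|no)=(\d+)", "id=7 no=42 xx=9")

# (?=...) requires the following text to match; (?!...) requires it not to.
# Neither consumes the text it inspects.
followed = re.findall(r"foo(?=bar)\w*", "foobar foobaz")
not_followed = re.findall(r"foo(?!bar)\w*", "foobar foobaz")

print(greedy)        # ['<b>one</b><b>two</b>']
print(lazy)          # ['<b>one</b>', '<b>two</b>']
print(repeated)      # is
print(digits)        # ['7', '42']
print(followed)      # ['foobar']
print(not_followed)  # ['foobaz']
```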
Escape Sequences
You can use the following escape sequences.
Character | Meaning |
---|---|
\0 | Null character. |
\xhh | Hex character. |
\n | Newline. |
\t | Tab. |
\b | Match a word boundary. |
\ooo | Octal character. |
\cC | Control character. |
\r | Return. |
\f | Form feed. |
\a | Alarm (bell). |
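A short sketch of some of these escape sequences, again using Python's `re` module as an assumed stand-in for the engine described here. Python's `re` does not accept the `\cC` form, so it is left out. The sample strings are invented.

```python
import re

# \x41 (hex) and \101 (octal) both denote the character "A".
hex_match = re.fullmatch(r"\x41", "A") is not None
octal_match = re.fullmatch(r"\101", "A") is not None

# \t and \n stand for the tab and newline characters themselves.
fields = re.split(r"\t", "name\tsize\tdate")
lines = re.findall(r"[^\n]+", "first\nsecond\n")

print(hex_match, octal_match)  # True True
print(fields)                  # ['name', 'size', 'date']
print(lines)                   # ['first', 'second']
```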
You can use the following escape sequences in replacement strings.
Character | Meaning |
---|---|
\u | Make the next character uppercase. |
\l | Make the next character lowercase. |
\U | Make all following characters uppercase until the next \E. |
\L | Make all following characters lowercase until the next \E. |
\E | End case modification, i.e., \U and \L. |
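Python's `re.sub` does not understand `\u`, `\l`, `\U`, `\L`, or `\E`, so the sketch below only emulates their effect with replacement functions. The helper names `upper_first` and `upper_all` are hypothetical, and the assumption that these sequences are applied to a captured group is made for illustration.

```python
import re

def upper_first(match) -> str:
    # Emulates \u applied to group 1: uppercase only the next character.
    word = match.group(1)
    return word[:1].upper() + word[1:]

def upper_all(match) -> str:
    # Emulates \U ... \E around group 1: uppercase everything up to \E.
    return match.group(1).upper()

title = re.sub(r"\b(\w+)", upper_first, "metadata replacement")
shout = re.sub(r"<(\w+)>", upper_all, "set <name> and <date>")

print(title)  # Metadata Replacement
print(shout)  # set NAME and DATE
```

The lowercase forms `\l` and `\L` would follow the same pattern with `str.lower()`.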